Precipitation and removal of ionic compounds from produced water: observed versus modeling results
2014 Spring. Includes bibliographical references. Produced water is generated during the hydraulic fracturing and drilling process and is regarded as the largest byproduct associated with oil and gas development. Samples of produced water from wells near Greeley, Colorado, were collected from February to July 2013. Commercial produced water treatment was conducted at the laboratory scale, and the results were compared to predictions from computer-based modeling software. Parameters such as pH and temperature were adjusted to test how they affect treatment for produced water softening. The study shows that removal performance is related to the pH adjustment of the coagulation process, to temperature, and to the pore size of the filtration membrane. A comparison between membrane pore sizes (2.5 micron and 0.2 micron) shows that the finer membrane (0.2 micron) improves removal performance. The results indicate that precipitation is not the limiting factor in divalent cation removal. During the research, OLI Chemical Analyst, a computer-based modeling program, analyzed the precipitation behavior of the water samples under different temperature (-15 °C to 25 °C) and pH (9.0 to 10.2) conditions. The modeling shows that lower temperatures can precipitate out different species: sodium ions are separated from the inflow (as NaAl(OH)2CO3, sodium aluminum dihydroxide carbonate) when the temperature is below 10 °C, while other metal ions, such as calcium and barium, cannot be removed efficiently. However, the modeling results for pH adjustment demonstrate that lower pH does not appreciably affect the scaling tendency of the target salts. The results show that magnesium ions can only be removed when the pH is higher than 11.0, so the pH adjustment for softening can be optimized accordingly.
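The precipitation modeling described above boils down to comparing an ion activity product against a solubility product under varying conditions. A minimal sketch of that comparison for calcite, assuming ideal behavior and using illustrative concentrations (not data from the study):

```python
# Minimal sketch: does CaCO3 tend to precipitate from a water sample?
# Compare the ion activity product (IAP) to the solubility product (Ksp).
# The Ksp value and the concentrations below are illustrative assumptions.
import math

KSP_CACO3 = 3.3e-9  # approximate Ksp of calcite at 25 degrees C

def saturation_index(ca_molar, co3_molar, ksp=KSP_CACO3):
    """SI = log10(IAP / Ksp); SI > 0 suggests supersaturation
    (precipitation is thermodynamically favored)."""
    iap = ca_molar * co3_molar
    return math.log10(iap / ksp)

si = saturation_index(2e-3, 1e-4)  # 2 mM Ca2+, 0.1 mM CO3^2-
print(f"SI = {si:.2f}")  # positive -> CaCO3 tends to precipitate
```

Full speciation software such as OLI additionally handles activity coefficients, temperature dependence of Ksp, and competing equilibria, which is why the study relies on it rather than a single-salt check like this.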
Essays on distance metric learning
Many machine learning methods, such as the k-nearest neighbours algorithm, heavily depend on the distance measure between data points. As each task has its own notion of distance, distance metric learning has been proposed. It learns a distance metric to assign a small distance to semantically similar instances and a large distance to dissimilar instances by formulating an optimisation problem. While many loss functions and regularisation terms have been proposed to improve the discrimination and generalisation ability of the learned metric, the metric may be sensitive to a small perturbation in the input space. Moreover, these methods implicitly assume that features are numerical variables and labels are deterministic. However, categorical variables and probabilistic labels are common in real-world applications. This thesis develops three metric learning methods to enhance robustness against input perturbation and applicability for categorical variables and probabilistic labels. In Chapter 3, I identify that many existing methods maximise a margin in the feature space and such margin is insufficient to withstand perturbation in the input space. To address this issue, a new loss function is designed to penalise the input-space margin for being small and hence improve the robustness of the learned metric. In Chapter 4, I propose a metric learning method for categorical data. Classifying categorical data is difficult due to high feature ambiguity, and to this end, the technique of adversarial training is employed. Moreover, the generalisation bound of the proposed method is established, which informs the choice of the regularisation term. In Chapter 5, I adapt a classical probabilistic approach for metric learning to utilise information on probabilistic labels. The loss function is modified for training stability, and new evaluation criteria are suggested to assess the effectiveness of different methods. 
At the end of this thesis, two publications on hyperspectral target detection are appended as additional work completed during my PhD.
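The core idea of distance metric learning described in this abstract can be illustrated with a simple Mahalanobis-style sketch, not the thesis's actual methods: learn a linear map L so that d(a, b) = ||L(a - b)|| shrinks same-class distances and enforces a margin on different-class distances. The data, loss, and update rule below are illustrative assumptions.

```python
# Hedged sketch of Mahalanobis metric learning with a hinge-style margin.
# Same-class pairs are pulled together; different-class pairs closer than
# the margin are pushed apart. Toy data, not the thesis's formulation.
import numpy as np

rng = np.random.default_rng(0)

def learn_metric(X, y, lr=0.05, epochs=500, margin=1.0):
    """Stochastic updates on L, where d(a, b) = ||L (a - b)||_2."""
    L = np.eye(X.shape[1])
    n = len(X)
    for _ in range(epochs):
        i, j = rng.integers(n), rng.integers(n)
        if i == j:
            continue
        v = X[i] - X[j]
        d = np.linalg.norm(L @ v) + 1e-12
        if y[i] == y[j]:
            # gradient of d^2: pull the pair together
            grad = 2.0 * (L @ np.outer(v, v))
        elif d < margin:
            # gradient of (margin - d)^2: push the pair apart
            grad = -2.0 * (margin - d) / d * (L @ np.outer(v, v))
        else:
            continue
        L -= lr * grad
    return L

# Toy data: two well-separated 2-D Gaussian clusters
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(2, 0.3, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
L = learn_metric(X, y)
```

A k-nearest-neighbour classifier would then use d(a, b) = ||L(a - b)|| in place of the Euclidean distance; the thesis's contributions (input-space margins, adversarial training for categorical data, probabilistic labels) build on this basic template.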
What Causes My Test Alarm? Automatic Cause Analysis for Test Alarms in System and Integration Testing
Driven by new software development processes and testing in clouds, system
and integration testing nowadays tends to produce an enormous number of alarms.
Such test alarms place an almost unbearable burden on software testing engineers,
who have to manually analyze the causes of these alarms. The causes are
critical because they decide which stakeholders are responsible for fixing the bugs
detected during testing. In this paper, we present a novel approach that
aims to relieve the burden by automating the procedure. Our approach, called
Cause Analysis Model, exploits information retrieval techniques to efficiently
infer test alarm causes from test logs. We have developed a prototype and
evaluated our tool on two industrial datasets with more than 14,000 test
alarms. Experiments on the two datasets show that our tool achieves accuracies
of 58.3% and 65.8%, respectively, outperforming the baseline algorithms by
up to 13.3%. Our algorithm is also extremely efficient, spending about 0.1 s per
cause analysis. Encouraged by these experimental results, our industrial
partner, a leading information and communication technology company, has
deployed the tool; it achieves an average accuracy of 72% after
two months of running, nearly three times more accurate than a previous
strategy based on regular expressions.
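The abstract describes inferring alarm causes from test logs with information retrieval. A minimal sketch of that idea, assuming a TF-IDF representation and nearest-neighbour lookup over labeled historical logs (the log texts and cause labels below are invented for illustration, and this is not the paper's exact model):

```python
# Hedged sketch: label a new test alarm with the cause of the most similar
# historical log under TF-IDF cosine similarity. Illustrative data only.
import math
from collections import Counter

def tfidf_vectors(docs):
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(t for doc in tokenized for t in set(doc))  # document frequency
    n = len(docs)
    vecs = []
    for doc in tokenized:
        tf = Counter(doc)
        vecs.append({t: c * math.log((1 + n) / (1 + df[t])) for t, c in tf.items()})
    return vecs

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Historical logs with manually assigned causes (hypothetical examples)
history = [
    ("connection timeout while calling auth service", "environment issue"),
    ("assertion failed expected 200 got 500", "product bug"),
    ("missing test fixture file config.yaml", "test script defect"),
]
logs = [log for log, _ in history]

def predict_cause(new_log):
    vecs = tfidf_vectors(logs + [new_log])
    query, past = vecs[-1], vecs[:-1]
    best = max(range(len(past)), key=lambda i: cosine(query, past[i]))
    return history[best][1]

print(predict_cause("connection timeout calling payment service"))
```

A production system would add log preprocessing, feature selection, and a trained classifier rather than 1-nearest-neighbour retrieval, but the retrieval core is the same: similar failure logs tend to share a cause.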
Learning Local Metrics and Influential Regions for Classification
The performance of distance-based classifiers heavily depends on the
underlying distance metric, so it is valuable to learn a suitable metric from
the data. To address the problem of multimodality, it is desirable to learn
local metrics. In this short paper, we define a new intuitive distance with
local metrics and influential regions, and subsequently propose a novel local
metric learning method for distance-based classification. Our key intuition is
to partition the metric space into influential regions and a background region,
and then to restrict the effect of each local metric to its influential
regions. We learn the local metrics and influential regions to
reduce the empirical hinge loss, and regularize the parameters on the basis of
a resultant learning bound. Encouraging experimental results are obtained on
various popular public data sets.
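The partitioning intuition above can be sketched in a few lines, under the simplifying assumptions that influential regions are balls around fixed centers and that a pair's distance uses the metric active at its midpoint (one simple choice; the paper's actual formulation learns the regions and handles region boundaries more carefully):

```python
# Hedged sketch of a distance with local metrics and influential regions:
# inside a region's ball, use that region's metric matrix M; elsewhere,
# use the background metric. Parameters here are illustrative, not learned.
import numpy as np

class LocalMetricDistance:
    def __init__(self, centers, radii, local_Ms, background_M):
        self.centers, self.radii = centers, radii
        self.local_Ms, self.background_M = local_Ms, background_M

    def metric_at(self, x):
        """Return the metric matrix active at point x."""
        for c, r, M in zip(self.centers, self.radii, self.local_Ms):
            if np.linalg.norm(x - c) <= r:
                return M
        return self.background_M

    def dist(self, x, y):
        # One simple convention: use the metric active at the midpoint.
        M = self.metric_at((x + y) / 2)
        v = x - y
        return float(np.sqrt(v @ M @ v))

# One influential region at the origin with a stretched local metric
dm = LocalMetricDistance(
    centers=[np.array([0.0, 0.0])],
    radii=[1.0],
    local_Ms=[4.0 * np.eye(2)],
    background_M=np.eye(2),
)
a, b = np.array([0.0, 0.0]), np.array([0.5, 0.0])
print(dm.dist(a, b))  # inside the region: magnified by the local metric
```

A distance-based classifier would plug `dm.dist` in wherever it currently uses the Euclidean distance; learning then adjusts the centers, radii, and metric matrices to reduce the hinge loss, as the abstract describes.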